ABSTRACT

An arbitrary nonlinear system whose input is a Gaussian process, and whose
output process has finite second moments, admits two kinds of
representations: the first in terms of a sequence of deterministic kernels
and the second in terms of a single stochastic kernel. We consider here the
identification of the sequence of deterministic kernels from the input and
output processes, the representation of the system output when its input is
a sample function of the Gaussian process, and the relationship of the
sequence of kernels mentioned above to the Volterra expansion kernels when
the system has a Volterra representation.


1.  STOCHASTIC  AND  MULTIPLE  WIENER  INTEGRALS 
FOR  GAUSSIAN  PROCESSES 


Let us first introduce our basic notation and terminology. We will
consider throughout a zero mean Gaussian process $X = \{X_t, t \in T\}$ with
covariance function $R(t,s)$, defined on a probability space
$(\Omega, \mathcal{B}, P)$. $T$ will be an interval of the real line.
$\mathcal{B}$ is usually taken to be $\mathcal{B}(X)$, the $\sigma$-field
generated by the process $X$, or $\bar{\mathcal{B}}(X)$, the completion of
$\mathcal{B}(X)$. There are two important Hilbert spaces associated with a
Gaussian process. The nonlinear space of $X$,
$L_2(X) = L_2(\Omega, \mathcal{B}(X), P)$, consists of all
$\mathcal{B}(X)$-measurable random variables with finite second moment, which
are called (nonlinear) $L_2$-functionals of $X$. The linear space of $X$,
$H(X)$, is the closed subspace of $L_2(X)$ spanned by $X_t$, $t \in T$, and
its elements are called linear $L_2$-functionals of $X$.

The first useful notion in the study of the nonlinear space of a Wiener
process is the Multiple Wiener Integral. This notion was first introduced
by Wiener (1938), who termed it "Polynomial Chaos," and was redefined in a
somewhat deeper way by Ito (1951). Ito showed that his multiple integrals
of different degree have the important property of being mutually orthogonal
and also presented their connection with the celebrated Fourier-Hermite
expansion of $L_2$-functionals of Cameron and Martin (1947). In his important
work on nonlinear problems Wiener (1958) reinterpreted the multiple Wiener
integrals for a Wiener process in an extremely simple and intuitive way and
made some interesting applications.


In [4] multiple Wiener integrals of the following two types are defined
for general Gaussian processes:

$$I_p(f_p) = \int \cdots \int f_p(t_1, \ldots, t_p)\, dX_{t_1} \cdots dX_{t_p},$$

$$J_p(f_p) = \int \cdots \int f_p(t_1, \ldots, t_p)\, X_{t_1} \cdots X_{t_p}\, dt_1 \cdots dt_p,$$

where $p = 1, 2, \ldots$, and we always write $I$ for $I_1$. We always assume
that $X_{t_0} = 0$ a.s. for some $t_0 \in T$ when dealing with integrals
$I_p$, and that $X$ is mean square continuous when dealing with integrals
$J_p$.
The integrands $f(t_1, \ldots, t_p)$ belong to appropriate Hilbert spaces of
"functions" defined on $T^p$, which are denoted by $\Lambda_2(\otimes^p R)$
for $I_p$, and $\lambda_2(\otimes^p R)$ for $J_p$. For instance
$\lambda_2(R)$ is the completion of the set of all functions $f(t)$ on $T$
for which the Riemann integral $R\!\!\iint f(t) f(s) R(t,s)\, dt\, ds$ exists
and is finite, with respect to the inner product

$$\langle f, g \rangle = R\!\!\iint f(t)\, g(s)\, R(t,s)\, dt\, ds .$$

When $T = [a,b]$, $\lambda_2(R)$ contains all square integrable functions
over $T$, as well as further interesting classes of functions (see [4]), but
also "functions" with properties similar to those of delta functions.
$\lambda_2(\otimes^p R)$ is defined similarly, and is isomorphic to the
tensor product $\otimes^p \lambda_2(R)$ under the natural correspondence.
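The inner product above can be approximated numerically on a grid. The following is a minimal sketch, assuming $T = [0,1]$, a midpoint rule, and the Wiener covariance $R(t,s) = \min(t,s)$ as an illustrative choice; the function name `lambda2_inner` and all parameters are hypothetical and do not come from [4].

```python
import numpy as np

def lambda2_inner(f, g, R, a=0.0, b=1.0, m=400):
    """Midpoint-rule approximation of <f, g> = iint f(t) g(s) R(t, s) dt ds on [a, b]^2."""
    dt = (b - a) / m
    t = a + (np.arange(m) + 0.5) * dt      # midpoints of the m grid cells
    Rmat = R(t[:, None], t[None, :])       # covariance evaluated on the grid
    return float(f(t) @ Rmat @ g(t)) * dt * dt

# Illustrative (assumed) choices: Wiener covariance and f = g = 1.
wiener_cov = lambda t, s: np.minimum(t, s)
one = lambda t: np.ones_like(t)

# <1, 1> is the variance of int_0^1 X_t dt, which equals 1/3 for the Wiener process.
val = lambda2_inner(one, one, wiener_cov)
```

For these choices the value can be compared with the closed form $\int_0^1\!\!\int_0^1 \min(t,s)\, dt\, ds = 1/3$.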


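The multiple Wiener integrals $J_p$, $p = 1, 2, \ldots$, have the following
properties ($f, g \in \lambda_2(\otimes^p R)$ and $a, b$ real numbers):

$$J_p(af + bg) = a J_p(f) + b J_p(g),$$

$$J_p(f) = J_p(\tilde f),$$

$$\langle J_p(f), J_p(g) \rangle_{L_2(X)} = p!\, \langle \tilde f, \tilde g \rangle_{\lambda_2(\otimes^p R)},$$

where $\tilde f$ is the symmetric version of $f$,

$$\langle J_p(f), J_q(g) \rangle_{L_2(X)} = 0 \quad \text{if } p \neq q,$$

$$J_p\big(\phi_1^{\otimes p_1} \tilde\otimes \cdots \tilde\otimes \phi_k^{\otimes p_k}\big) = H_{p_1}\Big(\int \phi_1(t) X_t\, dt\Big) \cdots H_{p_k}\Big(\int \phi_k(t) X_t\, dt\Big),$$

where $\phi_1, \ldots, \phi_k$ are orthonormal in $\lambda_2(R)$,
$p_1 + \cdots + p_k = p$, $H_p$ denote the Hermite polynomials, and
$\tilde\otimes$ denotes symmetric tensor product.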
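The orthogonality and norm properties above fix the normalization of the Hermite polynomials: taking $k = 1$ and a single orthonormal $\phi$, they require $E\{H_p(\xi) H_q(\xi)\} = p!\,\delta_{pq}$ for a standard normal $\xi$. This is the "probabilists'" normalization, which NumPy exposes as `numpy.polynomial.hermite_e`. The sketch below checks the relation by Gauss-Hermite quadrature; the function name `hermite_gram` is a hypothetical helper, not part of the source.

```python
import numpy as np
from math import factorial
from numpy.polynomial.hermite_e import hermegauss, hermeval

def hermite_gram(max_degree, quad_points=20):
    """Gram matrix G[p, q] = E[He_p(xi) He_q(xi)] for xi ~ N(0, 1)."""
    x, w = hermegauss(quad_points)          # nodes/weights for the weight exp(-x^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)            # normalise to the N(0, 1) density
    eye = np.eye(max_degree + 1)
    # Row p holds He_p evaluated at the nodes (a unit coefficient vector selects He_p).
    He = np.array([hermeval(x, eye[p]) for p in range(max_degree + 1)])
    return (He * w) @ He.T

G = hermite_gram(5)
expected = np.diag([float(factorial(p)) for p in range(6)])
```

With 20 quadrature nodes the rule is exact for polynomials of degree up to 39, so `G` should agree with `expected` up to rounding error.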

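Also every $L_2$-functional $\theta$ of $X$, $\theta \in L_2(X)$, has an
orthogonal development

$$\theta - E(\theta) = \sum_{p=1}^{\infty} J_p(f_p)$$

where $f_p \in \lambda_2(\otimes^p R)$, and if

$$\theta - E(\theta) = \sum_{p=1}^{\infty} J_p(f_p) = \sum_{p=1}^{\infty} J_p(g_p)$$

then $\tilde f_p = \tilde g_p$, $p = 1, 2, \ldots$.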

The second useful notion in the study of the nonlinear space of a Wiener
process is the stochastic integral. The stochastic integral was first
introduced by Ito (1944) for the Wiener process and later generalized by
Meyer for martingales. Every $L_2$-functional of a Wiener process has a
representation as a stochastic integral, where the integrand is adapted to
the Wiener process.


In [4] a stochastic integral

$$I(f) = \int f(t)\, dX_t$$

is defined for $L_2(X)$-valued "functions" $f(t)$ belonging to the Hilbert
space $\Lambda_{2;L_2(X)}(R)$, which is defined in a way similar to
$\Lambda_2(R)$ and is isomorphic to the tensor product
$\Lambda_2(R) \otimes L_2(X)$ under the correspondence
$\phi(\cdot)\zeta \leftrightarrow \phi \otimes \zeta$,
$\phi \in \Lambda_2(R)$, $\zeta \in L_2(X)$. The stochastic integral

$$I : \Lambda_{2;L_2(X)}(R) \to L_2(X)$$

is an unbounded densely defined closed linear map with range the set of all
zero mean random variables in $L_2(X)$ (for further properties and some
evaluations of the stochastic integral see [4]). Thus every $L_2$-functional
$\theta$ of $X$, $\theta \in L_2(X)$, admits the representation

$$\theta = E(\theta) + \int f(t)\, dX_t$$

for some (non-unique) $f \in \Lambda_{2;L_2(X)}(R)$ in the domain of $I$, and
$f$ may be taken to be adapted to $X$ ("adapted to" meaning "measurable with
respect to the past of"). Notice that, as shown in [4], the stochastic
integral becomes norm preserving when restricted to nonanticipatory
integrands, but it is not yet known which $L_2$-functionals admit
nonanticipatory stochastic integral representations ("nonanticipatory"
meaning "independent of the future increments of $X$").


2.  NONLINEAR  SYSTEMS  WITH  GAUSSIAN  INPUTS 


Let $\theta$ be a nonlinear system with input the Gaussian process
$X = \{X_t, t \in T\}$ and output the process $Y = \{Y_t, t \in T\}$ (the
parameter sets of the input and output processes could of course be
distinct). The only assumption we make on the nonlinear system without
further notice is that the process $Y$ is of second order, i.e. each $Y_t$
is an $L_2$-functional of $X$. Then from Section 1, we have two
representations of the system $\theta$ for the input $X$. The first is a
series representation in terms of multiple Wiener integrals

$$Y_t = E(Y_t) + \sum_{p=1}^{\infty} \int \cdots \int f_p(t; t_1, \ldots, t_p)\, X_{t_1} \cdots X_{t_p}\, dt_1 \cdots dt_p \tag{1}$$

where $f_p(t; \cdot) \in \lambda_2(\otimes^p R)$ may (and will from now on) be
taken to be symmetric. The second is a stochastic integral representation

$$Y_t = E(Y_t) + \int f(t; s)\, X_s\, ds \tag{2}$$

where $f(t; \cdot) \in \lambda_{2;L_2(X)}(R)$ may be taken to be adapted to
$X$. Thus the system $\theta$ can be represented for the input $X$ either by
the sequence of deterministic kernels $f_p(t; t_1, \ldots, t_p)$,
$p = 1, 2, \ldots$, as in (1), or by the single stochastic kernel $f(t; s)$
as in (2). It should be emphasized that both the sequence of deterministic
kernels $\{f_p\}$ and the stochastic kernel $f$ depend not only on the system
$\theta$ but also on the input Gaussian process $X$. Thus distinct input
processes will always produce distinct stochastic kernels (unless of course
the system is linear) in representation (2), and will in general produce
distinct sequences of deterministic kernels in representation (1).

From now on we concentrate on the representation (1). Notice that
expansion (1) looks very much like a Volterra kernel expansion, with the
important difference that the multiple integrals are multiple Wiener rather
than Lebesgue integrals; it could thus be considered as a stochastic Volterra
kernel expansion. Several system synthesis or identification problems
naturally suggest themselves at this point:


(P1)  Knowing the input and output processes $X$ and $Y$, find the kernels
      $f_p$, $p = 1, 2, \ldots$.

(P2)  Knowing the kernels $f_p$, $p = 1, 2, \ldots$, in the representation of
      the system for the Gaussian input $X$, can one represent the output of
      the system to another Gaussian input or to a deterministic input?

(P3)  Assuming that the system $\theta$ when acting on deterministic inputs
      in, say, $L_2[a,b]$ has a Volterra kernel expansion

      $$y(t) = k_0(t) + \sum_{p=1}^{\infty} L\!\!\int \cdots \int k_p(t; t_1, \ldots, t_p)\, x(t_1) \cdots x(t_p)\, dt_1 \cdots dt_p \tag{3}$$

      where $L$ denotes Lebesgue integral, $y = \theta(x)$ and
      $k_p(t; \cdot) \in L_2([a,b]^p)$, what is the relationship between the
      sets of kernels $\{f_p\}$ and $\{k_p\}$?

In the following we consider problems (P1) and (P3), which are
straightforward, and we begin an exploration of the seemingly harder problem
(P2). The analysis is based on the structure developed in [4].


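We begin with problem (P1). Let $\{\phi_n\}$ be any complete orthonormal set
in $\lambda_2(R)$. Then we have for every $t \in T$ (omitting the arguments
$t_1, \ldots, t_p$)

$$f_p(t) = \sum a^{p_1 \cdots p_k}_{n_1 \cdots n_k}(t)\; \phi_{n_1}^{\otimes p_1} \tilde\otimes \cdots \tilde\otimes \phi_{n_k}^{\otimes p_k} \tag{4}$$

where the sum is taken over all $k = 1, \ldots, p$,
$p_1 + \cdots + p_k = p$, and $n_1, \ldots, n_k = 1, 2, \ldots$ and converges
in $\lambda_2(\otimes^p R)$, and the coefficients are given by

$$
\begin{aligned}
p_1! \cdots p_k!\; a^{p_1 \cdots p_k}_{n_1 \cdots n_k}(t)
&= p!\, \big\langle f_p(t),\; \phi_{n_1}^{\otimes p_1} \tilde\otimes \cdots \tilde\otimes \phi_{n_k}^{\otimes p_k} \big\rangle_{\lambda_2(\otimes^p R)} \\
&= E\big\{ J_p(f_p(t))\, J_p\big(\phi_{n_1}^{\otimes p_1} \tilde\otimes \cdots \tilde\otimes \phi_{n_k}^{\otimes p_k}\big) \big\} \\
&= E\Big\{ Y_t\, H_{p_1}\Big(\int \phi_{n_1}(s) X_s\, ds\Big) \cdots H_{p_k}\Big(\int \phi_{n_k}(s) X_s\, ds\Big) \Big\}.
\end{aligned}
\tag{5}
$$

Hence from the input and output processes $X$ and $Y$ we can find the
coefficients $a$ from (5) and thus the kernels $f_p(t)$ from (4). Notice that
the functions $\phi_n(t)$ can be chosen by orthonormalizing (in
$\lambda_2(R)$ of course) any set of functions complete in $L_2(T)$; or else
by putting $\phi_n = \lambda_n^{-1/2} e_n$ where $\{\lambda_n\}$ and
$\{e_n\}$ are the eigenvalues and eigenfunctions of $R(t,s)$ [1]. This method
of determining $f_p$ has of course all the usual disadvantages of a series
expansion. When $T = (-\infty, +\infty)$, $X$ is stationary and
$f_p(t; t_1, \ldots, t_p) = f_p(t_1 - t, \ldots, t_p - t)$,
$p = 1, 2, \ldots$, then a somewhat simpler method for approximating $f_p$
can be found but we will not go into this here.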
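To make the identification recipe concrete, here is a minimal Monte Carlo sketch of formula (5) on a discretized time grid. Everything specific in it is an assumption made for illustration: the covariance $R(t,s) = e^{-|t-s|}$ on $T = [0,1]$, the grid size, the sample count, and the test system $Y = (\int_0^1 X_s\, ds)^2$. It uses the eigenfunction choice of the $\phi_n$ and estimates only the coefficients with $k = 1$, $p = 2$, for which (5) reads $2!\, a = E\{Y H_2(\xi_n)\}$ with $\xi_n = \int \phi_n(s) X_s\, ds$ and $H_2(x) = x^2 - 1$.

```python
import numpy as np

rng = np.random.default_rng(0)             # fixed seed so the sketch is repeatable

# Grid on T = [0, 1] and an assumed covariance R(t, s) = exp(-|t - s|).
m, n_samples = 40, 100_000
dt = 1.0 / m
t = (np.arange(m) + 0.5) * dt
R = np.exp(-np.abs(t[:, None] - t[None, :]))

# Eigenpairs of the integral operator with kernel R (discretised as R * dt).
lam, v = np.linalg.eigh(R * dt)
lam, v = lam[::-1], v[:, ::-1]             # sort eigenvalues in decreasing order
e = v / np.sqrt(dt)                        # eigenfunctions normalised so int e_n^2 dt = 1
phi = e / np.sqrt(lam)                     # orthonormal in the R-weighted inner product

# Sample paths of X and the linear functionals xi_n = int phi_n(s) X_s ds.
X = rng.standard_normal((n_samples, m)) @ np.linalg.cholesky(R).T
xi = X @ phi * dt

# Hypothetical second-order system: Y = (int_0^1 X_s ds)^2.
Y = (X.sum(axis=1) * dt) ** 2

# Formula (5) with k = 1, p = 2: 2! * a_n = E{Y He_2(xi_n)}, He_2(x) = x^2 - 1.
a_hat = np.mean(Y[:, None] * (xi[:, :3] ** 2 - 1.0), axis=0) / 2.0

# Value implied by expanding int X ds in the xi_n, for comparison.
a_exact = lam[:3] * (e[:, :3].sum(axis=0) * dt) ** 2
```

The estimates `a_hat` carry Monte Carlo error of order $n^{-1/2}$, which illustrates one of the "usual disadvantages" mentioned above: each coefficient needs its own expectation, and higher-order Hermite products have large variance.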

Problem (P2) is the most interesting as well as the most difficult one.
Of course, if one is willing to put strong assumptions on the system, like
those in problem (P3), then, as we shall see, the situation is fairly
straightforward. Thus the main point of problem (P2) is to investigate
conditions on the system, much weaker than those made in problem (P3), under
which appropriate positive answers to problem (P2) can be given. Here we
consider only the relationship of the representation of the system for the
Gaussian input $X$ to its representation for a deterministic input which is a
sample function of $X$.

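From (1) and (4) we have

$$
Y_t - E(Y_t) = \sum_{p=1}^{\infty} J_p(f_p(t))
= \sum_{p=1}^{\infty} \sum_{\substack{k = 1, \ldots, p \\ p_1 + \cdots + p_k = p \\ n_1, \ldots, n_k = 1, 2, \ldots}}
a^{p_1 \cdots p_k}_{n_1 \cdots n_k}(t)\; J_p\big(\phi_{n_1}^{\otimes p_1} \tilde\otimes \cdots \tilde\otimes \phi_{n_k}^{\otimes p_k}\big).
$$

Then writing $H_p(\xi) = \sum_{j=0}^{p} c^p_j\, \xi^j$ for a standard normal
random variable $\xi$, and $\xi_n = \int \phi_n(s) X_s\, ds$, we have

$$
\begin{aligned}
J_p\big(\phi_{n_1}^{\otimes p_1} \tilde\otimes \cdots \tilde\otimes \phi_{n_k}^{\otimes p_k}\big)
&= H_{p_1}(\xi_{n_1}) \cdots H_{p_k}(\xi_{n_k}) \\
&= \sum_{\substack{j_i = 0, \ldots, p_i \\ i = 1, \ldots, k}} c^{p_1}_{j_1} \cdots c^{p_k}_{j_k}\; \xi_{n_1}^{j_1} \cdots \xi_{n_k}^{j_k} \\
&= \sum_{\substack{j_i = 0, \ldots, p_i \\ i = 1, \ldots, k}} c^{p_1}_{j_1} \cdots c^{p_k}_{j_k}\;
L\!\!\int \cdots \int \big(\phi_{n_1}^{\otimes j_1} \otimes \cdots \otimes \phi_{n_k}^{\otimes j_k}\big)(t_1, \ldots, t_j)\, X_{t_1} \cdots X_{t_j}\, dt_1 \cdots dt_j ,
\end{aligned}
$$

where $j = j_1 + \cdots + j_k$, and if we define

$$
h^N_q(t; t_1, \ldots, t_q) = \sum_{p=q}^{N} \sum_{\substack{k = 1, \ldots, p \\ n_1, \ldots, n_k = 1, \ldots, N}} \sum_{\substack{p_1 + \cdots + p_k = p \\ j_1 + \cdots + j_k = q}}
c^{p_1}_{j_1} \cdots c^{p_k}_{j_k}\; a^{p_1 \cdots p_k}_{n_1 \cdots n_k}(t)\;
\big(\phi_{n_1}^{\otimes j_1} \otimes \cdots \otimes \phi_{n_k}^{\otimes j_k}\big)(t_1, \ldots, t_q),
$$

we obtain by a simple rearrangement of terms

$$Y_t - E(Y_t) = \sum_{q=0}^{\infty} \lim_{N \to \infty} L\!\!\int \cdots \int h^N_q(t; t_1, \ldots, t_q)\, X_{t_1} \cdots X_{t_q}\, dt_1 \cdots dt_q . \tag{6}$$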
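The coefficients of the Hermite polynomials in powers of $\xi$, which enter the kernels $h^N_q$, are easy to tabulate. The sketch below assumes the probabilists' normalization of the Hermite polynomials (the one for which $E\{H_p(\xi)^2\} = p!$) and uses NumPy's `herme2poly` conversion; the helper name `hermite_power_coeffs` is hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import herme2poly

def hermite_power_coeffs(p):
    """Return [c_0, ..., c_p] such that He_p(x) = sum_j c_j x^j."""
    unit = np.zeros(p + 1)
    unit[p] = 1.0                          # Hermite-series coefficients selecting He_p
    return herme2poly(unit)                # convert to ordinary power-basis coefficients

c2 = hermite_power_coeffs(2)               # He_2(x) = x^2 - 1
c3 = hermite_power_coeffs(3)               # He_3(x) = x^3 - 3x
```

Note that the leading coefficient is always 1 and the next one is always 0, which is the mechanism behind the two highest-order kernels being unchanged when passing from the Wiener-integral form to the sample-function form.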

Since the convergence in (6) is in mean square, along some subsequence we
will have convergence with probability one. Thus we can write

$$Y_t - E(Y_t) = \lim_{n \to \infty} \sum_{q=0}^{N_n} L\!\!\int \cdots \int h^{N_n}_q(t; t_1, \ldots, t_q)\, X_{t_1} \cdots X_{t_q}\, dt_1 \cdots dt_q \quad \text{a.s.} \tag{7}$$

Thus for almost every sample function of the process $X$, the output of the
system can be represented by (7). Notice that the kernels $h^N_q$ in (7) can
be found from the kernels $f_p$ and that $h^N_q(t; t_1, \ldots, t_q)$ are
continuous functions in $t_1, \ldots, t_q$ if we choose (as we can) the
functions $\phi_n(t)$ to be continuous. Hence knowing the representation of
the system output to a Gaussian input we can find the representation of the
system output for a certain class of deterministic inputs, namely almost all
sample functions of the Gaussian process. This deterministic input-output
representation, given by (7), depends of course on the covariance $R$ of the
Gaussian process in the following two ways:

(i)   $R$ determines the kernels $h^N_q$ of the representation (7), and

(ii)  $R$ determines, up to a zero probability set, the deterministic
      functions for which representation (7) is valid.

Representation (7) has the following two remarkable features. First, even
though it is valid for a small class of deterministic inputs and its kernels
depend on that class, it was obtained under extremely weak assumptions on
the system, namely $E(Y_t^2) < \infty$. Second, it is identical in form to
the representation

$$y(t) = \lim_{n \to \infty} \Big\{ k^{N_n}_0(t) + \sum_{q=1}^{N_n} L\!\!\int \cdots \int k^{N_n}_q(t; t_1, \ldots, t_q)\, x(t_1) \cdots x(t_q)\, dt_1 \cdots dt_q \Big\} \tag{8}$$

obtained by Fréchet (1910) for all $x(t)$ in $C[a,b]$ or in $L_2[a,b]$ under
the assumption that the system is continuous, in the sense that for each
fixed $t$, $y(t)$ is a continuous functional on $C[a,b]$ or $L_2[a,b]$; in
(8) the kernels $k^{N_n}_q$ depend only on the system and not on the
particular input $x(t)$ in $C[a,b]$ or in $L_2[a,b]$. It is thus remarkable
that a representation like (8), valid not for all but only for some
functions in $L_2[a,b]$ (or in $C[a,b]$ if $X$ has continuous sample
functions), was derived with no continuity assumptions on the system.

If, furthermore, it turns out that for each $q = 1, 2, \ldots$, the kernels
$h^N_q(t)$ converge in $L_2([a,b]^q)$ to $h_q(t)$ (which constitutes an
additional restriction on the system) then (6) and thus (7) may be simplified
to

$$Y_t - E(Y_t) = \lim_{n \to \infty} \sum_{q=0}^{N_n} L\!\!\int \cdots \int h_q(t; t_1, \ldots, t_q)\, X_{t_1} \cdots X_{t_q}\, dt_1 \cdots dt_q \quad \text{a.s.} \tag{9}$$


Finally, we should remark that if the system acting on the Gaussian input
$X$ is of finite order $P$, in the sense that in (1) $f_p = 0$ for $p > P$,

$$Y_t - E(Y_t) = \sum_{p=1}^{P} \int \cdots \int f_p(t; t_1, \ldots, t_p)\, X_{t_1} \cdots X_{t_p}\, dt_1 \cdots dt_p , \tag{10}$$

then it has the same order $P$ when acting on the sample functions of $X$, in
the sense that (7) is written as

$$Y_t - E(Y_t) = \lim_{n \to \infty} \sum_{p=0}^{P} L\!\!\int \cdots \int h^{N_n}_p(t; t_1, \ldots, t_p)\, X_{t_1} \cdots X_{t_p}\, dt_1 \cdots dt_p \quad \text{a.s.} \tag{11}$$

and if, furthermore, each $h^N_p(t)$ converges in $L_2([a,b]^p)$ to $h_p(t)$
we have

$$Y_t - E(Y_t) = \sum_{p=0}^{P} L\!\!\int \cdots \int h_p(t; t_1, \ldots, t_p)\, X_{t_1} \cdots X_{t_p}\, dt_1 \cdots dt_p \quad \text{a.s.} \tag{12}$$

It may be checked (even though the calculations are somewhat messy) that in
the latter case we always have

$$h_P = f_P, \qquad h_{P-1} = f_{P-1},$$

the assumption on the convergence of the $h^N_p(t)$'s implying that the
kernels $f_p(t)$ belong to $L_2([a,b]^p)$.

We finally consider problem (P3) which consists in finding the kernels
$\{k_p\}$ when the kernels $\{f_p\}$ are known, and vice versa. From (3) we
have

$$Y_t = k_0(t) + \sum_{p=1}^{\infty} L\!\!\int \cdots \int k_p(t; t_1, \ldots, t_p)\, X_{t_1} \cdots X_{t_p}\, dt_1 \cdots dt_p \quad \text{a.s.} \tag{13}$$

On the other hand representation (1) implies (7) which in view of (13) can
be written as in (9). Now it follows from (9) and (13) that

$$k_0(t) = E(Y_t) + h_0(t)$$

and for $p = 1, 2, \ldots$

$$L\!\!\int \cdots \int k_p(t; t_1, \ldots, t_p)\, X_{t_1} \cdots X_{t_p}\, dt_1 \cdots dt_p
= L\!\!\int \cdots \int h_p(t; t_1, \ldots, t_p)\, X_{t_1} \cdots X_{t_p}\, dt_1 \cdots dt_p \quad \text{a.s.} \tag{14}$$

If $M^p_X$ is the subspace of $L_2([a,b]^p)$ generated by the symmetric
functions $\{X_{t_1}(\omega) \cdots X_{t_p}(\omega),\ \omega \in \Omega - \Omega_0\}$,
where $\Omega_0$ is the exceptional set in (14), then (14) is equivalent to

$$h_p(t) = \mathrm{Proj}_{M^p_X}\, k_p(t) .$$

Thus, in general, knowledge of the kernels $\{f_p\}$, and thus the kernels
$\{h_p\}$, determines only the projections of the kernels $k_p$ onto
$M^p_X$. The kernels $k_p$ will be determined from the kernels $f_p$, by
$k_p = h_p$, only if the subspaces $M^p_X$ consist of all symmetric functions
in $L_2([a,b]^p)$. An equivalent condition is that if

$$R_p(t_1, \ldots, t_p; s_1, \ldots, s_p) = E(X_{t_1} \cdots X_{t_p}\, X_{s_1} \cdots X_{s_p})$$

and $R_p$ denotes also the integral type operator in $L_2([a,b]^p)$ with
kernel $R_p$, then $R_p$ should be strictly positive definite, or the null
space of $R_p^{1/2}$ should be $\{0\}$.
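As a worked illustration of the kernel appearing in this condition (not taken from the source), the moment formula for zero mean jointly Gaussian variables expresses it through the covariance $R$ alone. For $p = 1$ and $p = 2$:

```latex
% p = 1: the kernel is the covariance itself.
R_1(t_1; s_1) = E(X_{t_1} X_{s_1}) = R(t_1, s_1),

% p = 2: sum over the three pairings of {t_1, t_2, s_1, s_2}.
R_2(t_1, t_2; s_1, s_2) = E(X_{t_1} X_{t_2} X_{s_1} X_{s_2})
  = R(t_1, t_2)\, R(s_1, s_2)
  + R(t_1, s_1)\, R(t_2, s_2)
  + R(t_1, s_2)\, R(t_2, s_1).
```

So for $p = 1$ the condition reduces to strict positive definiteness of the covariance operator with kernel $R$ on $L_2[a,b]$.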


Conversely, knowledge of the kernels $k_p$ clearly determines the kernels
$f_p$. This is quite obvious from (13). The precise relationship can be
derived by using the property

$$
\int \cdots \int f_p(t; t_1, \ldots, t_p)\, X_{t_1} \cdots X_{t_p}\, dt_1 \cdots dt_p
= \mathrm{Proj}_{\bar Q_p}\, Y_t
= \sum_{q=p}^{\infty} L\!\!\int \cdots \int k_q(t; t_1, \ldots, t_q)\, \mathrm{Proj}_{\bar Q_p}(X_{t_1} \cdots X_{t_q})\, dt_1 \cdots dt_q
\tag{15}
$$

where $\bar Q_p$ is the closure in $L_2(X)$ of $Q_p$, the set of all
polynomials in the elements of $H(X)$ with degree $p$ which are orthogonal to
all polynomials with degree $\le p-1$. The projections of
$(X_{t_1} \cdots X_{t_q})$ onto $\bar Q_p$ can be expressed by using the
Grad-Barrett-Hermite polynomials (see Section II of Root (1965)) and then
(15) will give each $f_p$ in terms of $k_p, k_{p+2}, \ldots$. When only the
first $P$ terms in (13) are present, the same will be true for (1) and in
this case $f_P$ can be expressed in terms of $k_P$, $f_{P-1}$ in terms of
$k_{P-1}$, $f_{P-2}$ in terms of $k_{P-2}$ and $k_P$, $f_{P-3}$ in terms of
$k_{P-3}$ and $k_{P-1}$, etc. Again we will have in fact $f_P = k_P$ and
$f_{P-1} = k_{P-1}$ as shown in Root (1965), where the specific expressions
for $P = 3$ are also given.
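For orientation, the lowest nontrivial case $P = 2$ can be worked out by hand. This is a sketch under assumptions not stated in the source: $k_2$ is taken symmetric, and expectation and integration are assumed to be interchangeable.

```latex
% Volterra form (13) truncated at P = 2:
Y_t = k_0(t) + \int k_1(t; t_1)\, X_{t_1}\, dt_1
    + \iint k_2(t; t_1, t_2)\, X_{t_1} X_{t_2}\, dt_1\, dt_2 .

% Mean of the output (X has zero mean):
E(Y_t) = k_0(t) + \iint k_2(t; t_1, t_2)\, R(t_1, t_2)\, dt_1\, dt_2 .

% Projections of the monomials onto the homogeneous spaces:
\mathrm{Proj}_{\bar Q_1}(X_{t_1}) = X_{t_1}, \qquad
\mathrm{Proj}_{\bar Q_1}(X_{t_1} X_{t_2}) = 0, \qquad
\mathrm{Proj}_{\bar Q_2}(X_{t_1} X_{t_2}) = X_{t_1} X_{t_2} - R(t_1, t_2) .

% Inserting these into (15) for p = 1 and p = 2:
f_1 = k_1, \qquad f_2 = k_2 .
```

This agrees with the general statement $f_P = k_P$, $f_{P-1} = k_{P-1}$; the only effect of $k_2$ on lower orders is the shift of the mean by $\iint k_2 R$.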